NEURAghe: Exploiting CPU-FPGA Synergies for Efficient and Flexible CNN Inference Acceleration on Zynq SoCs

Authors

  • Paolo Meloni
  • Alessandro Capotondi
  • Gianfranco Deriu
  • Michele Brian
  • Francesco Conti
  • Davide Rossi
  • Luigi Raffo
  • Luca Benini
Abstract

  • Paolo Meloni, Università di Cagliari, Italy
  • Alessandro Capotondi, Università di Bologna, Italy
  • Gianfranco Deriu, Università di Cagliari, Italy and T3LAB, Italy
  • Michele Brian, T3LAB, Italy
  • Francesco Conti, Università di Bologna, Italy and ETH Zurich, Switzerland
  • Davide Rossi, Università di Bologna, Italy
  • Luigi Raffo, Università di Cagliari, Italy
  • Luca Benini, Università di Bologna, Italy and ETH Zurich, Switzerland

Similar resources

Synergy: A HW/SW Framework for High Throughput CNNs on Embedded Heterogeneous SoC

Convolutional Neural Networks (CNNs) have been widely deployed in diverse application domains. There has been significant progress in accelerating both their training and inference using high-performance GPUs, FPGAs, and custom ASICs for datacenter-scale environments. The recent proliferation of mobile and IoT devices has necessitated real-time, energy-efficient deep neural network inference on...


De-Mystifying accelerated Smart Vision Systems with All Programmable SoCs

Complex SoC devices, such as the Zynq® All Programmable SoC family from Xilinx®, are being chosen by designers for the next generation of smart and intelligent, embedded smart vision systems. SoCs offer new levels of processing acceleration that were not possible in older multi-chip architectures due to the abundant and tightly coupled connectivity of the ARM® Dual Cortex A9 processing system a...


Automated flow for compressing convolution neural networks for efficient edge-computation with FPGA

Solutions based on deep convolutional neural networks (CNNs) are the current state-of-the-art for computer vision tasks. Due to the large size of these models, they are typically run on clusters of CPUs or GPUs. However, power requirements and cost budgets can be a major hindrance to the adoption of CNNs for IoT applications. Recent research highlights that CNNs contain significant redundancy in their str...


How to Break Secure Boot on FPGA SoCs Through Malicious Hardware

Embedded IoT devices are often built upon large system-on-chip computing platforms running a significant stack of software. For certain computation-intensive operations, such as signal processing or encryption and authentication of large data, chips with integrated FPGAs (FPGA SoCs), which provide high performance through configurable hardware designs, are used. In this contribution, we demonstra...


A Design Methodology for Efficient Implementation of Deconvolutional Neural Networks on an FPGA

In recent years, deep learning algorithms have shown extremely high performance on machine learning tasks such as image classification and speech recognition. In support of such applications, various FPGA accelerator architectures have been proposed for convolutional neural networks (CNNs) that enable high performance for classification tasks at lower power than CPU and GPU processors. However, ...


Journal title:
  • CoRR

Volume: abs/1712.00994

Publication date: 2017